Autoregressive Model

An autoregressive model forecasts the variable of interest using a linear combination of its past values.

An autoregressive model of order p, written AR(p), is given as \[y_t = c + ϕ_1y_{t-1} + ϕ_2y_{t-2} + \dots + ϕ_py_{t-p} + e_t\] where \(e_t\) is white noise.

Changing the parameters results in different time series patterns. The variance of the error term only changes the scale of the series, not the pattern.

Special cases of the AR(1) model:

  1. When \(ϕ_1\) = 0, then \(y_t\) is equivalent to white noise.

  2. When \(ϕ_1\) = 1 and c = 0, then \(y_t\) is equivalent to a random walk.

  3. When \(ϕ_1\) = 1 and \(c \ne 0\), then \(y_t\) is equivalent to a random walk with drift.

  4. When \(ϕ_1\) < 0, \(y_t\) tends to oscillate between positive and negative values.

We normally restrict autoregressive models to stationary data, in which case some constraints on the values of the parameters are required.

  1. For an AR(1) model: \(-1 < ϕ_1 < 1\).

  2. For an AR(2) model: \(-1 < ϕ_2 < 1\), \(ϕ_1 + ϕ_2 < 1\) and \(ϕ_2 - ϕ_1 < 1\).

When p \(\ge\) 3, the restrictions get more complicated.
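These constraints are equivalent to requiring the roots of the AR characteristic polynomial to lie outside the unit circle. As a quick numerical check, the AR(2) conditions can be compared against the roots of \(1 - ϕ_1z - ϕ_2z^2\); a Python sketch (the function name is our own):

```python
import cmath

def ar2_stationary(phi1, phi2):
    """Check the AR(2) stationarity constraints and, equivalently, that the
    roots of 1 - phi1*z - phi2*z^2 lie outside the unit circle (phi2 != 0)."""
    constraints = (-1 < phi2 < 1) and (phi1 + phi2 < 1) and (phi2 - phi1 < 1)
    # Roots of phi2*z^2 + phi1*z - 1 = 0 (the characteristic polynomial rearranged).
    disc = cmath.sqrt(phi1 ** 2 + 4 * phi2)
    roots = [(-phi1 + disc) / (2 * phi2), (-phi1 - disc) / (2 * phi2)]
    roots_ok = all(abs(z) > 1 for z in roots)
    return constraints, roots_ok
```

For example, \(ϕ_1 = 0.5, ϕ_2 = 0.3\) passes both checks, while \(ϕ_1 = -0.8, ϕ_2 = 0.3\) fails both (since \(ϕ_2 - ϕ_1 = 1.1 > 1\)).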

Moving Average Model

A moving average model uses past forecast errors as predictors in a linear regression type model.

A moving average model of order q, written MA(q), is given as \[y_t = c + e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \dots + \theta_qe_{t-q}\] Note that it is not exactly a linear regression model because we do not observe \(e_t\).

This is called a moving average model because each \(y_t\) can be thought of as a weighted moving average of past forecast errors. It differs from moving average smoothing: an MA model is used to forecast future values, while MA smoothing is used to estimate the trend-cycle component.

It is possible to write any stationary AR(p) model as an MA(∞) model. The reverse is also true if we impose some constraints on the MA parameters; the MA model is then called invertible. For example, consider an MA(1) process written as an AR(∞) process: the most recent error can be expressed as a linear function of current and past observations, \[e_t = \sum_{j=0}^{∞}(-\theta)^j y_{t-j}.\] When \(|\theta| > 1\), the weights increase as the lag increases, so the more distant the observations the greater their influence on the current error. When \(|\theta| = 1\), the weights are constant in size, and distant observations have the same influence as recent ones. As neither of these situations makes much sense, we require \(|\theta| < 1\), so that the most recent observations have higher weight than observations from the more distant past. Thus, the process is invertible when \(|\theta| < 1\). The invertibility constraints on the parameters are:

  1. For an MA(1) model: \(-1 < \theta_1 < 1\).

  2. For an MA(2) model: \(-1 < \theta_2 < 1\), \(\theta_1 + \theta_2 > -1\) and \(\theta_1 - \theta_2 < 1\).
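The effect of the invertibility condition can be seen by computing the AR(∞) weights \((-\theta)^j\) directly. A small Python sketch (θ values chosen for illustration):

```python
def ma1_inverse_weights(theta, n_lags=10):
    """Absolute weights on past observations when an MA(1) process is
    inverted to AR(infinity): e_t = sum_j (-theta)^j * y_{t-j}."""
    return [abs((-theta) ** j) for j in range(n_lags)]

# Invertible case (|theta| < 1): weights decay, recent observations dominate.
decaying = ma1_inverse_weights(0.5)
# Non-invertible case (|theta| > 1): weights grow with the lag.
growing = ma1_inverse_weights(1.5)
```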

Non-seasonal ARIMA Models

If we combine differencing with autoregression and a moving average model, we obtain a non-seasonal ARIMA model. The full model can be written as \[y_{t}^{'} = c + ϕ_1y_{t-1}^{'} + ϕ_2y_{t-2}^{'} + \dots + ϕ_py_{t-p}^{'} + \theta_1 e_{t-1} + \theta_2 e_{t-2} + \dots + \theta_qe_{t-q} + e_t\]

where \(y_{t}^{'}\) is the differenced series. This is called an ARIMA(p,d,q) model, where

p = order of the autoregressive part,

d = degree of first differencing involved,

q = order of the moving average part.

Other models arise as special cases of ARIMA models:

  1. ARIMA(0,0,0) = white noise.

  2. ARIMA(0,1,0) with no constant = Random walk.

  3. ARIMA(0,1,0) with a constant = Random walk with drift.

  4. ARIMA(p,0,0) = Autoregression.

  5. ARIMA(0,0,q) = Moving average model.

Now let B denote the backshift operator, so that \(By_t = y_{t-1}\).

A first difference can then be written as \((1-B)y_t\).

In backshift notation, the AR and MA parts of the ARIMA equation become polynomials in B: \[ϕ(B) = 1 - ϕ_1B - ϕ_2B^2 - \dots - ϕ_pB^p\] \[\theta(B) = 1 + \theta_1B + \theta_2B^2 + \dots + \theta_qB^q\]

then the ARIMA equation is \[(1 - ϕ_1B - ϕ_2B^2 - \dots - ϕ_pB^p) (1-B)^dy_t = c + (1+\theta_1B + \theta_2B^2 + \dots + \theta_qB^q)e_t\] Selecting p, d and q by hand can be difficult, so we use the auto.arima function to do it.
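The backshift algebra is easy to mechanise: multiplying the AR polynomial by \((1-B)^d\) gives the coefficients of the expanded left-hand side. A Python sketch (function names are our own):

```python
def poly_mul(a, b):
    """Multiply two polynomials in B, coefficients ordered by power of B."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def arima_lhs_coeffs(phi, d):
    """Coefficients of (1 - phi1*B - ... - phip*B^p)(1 - B)^d."""
    poly = [1.0] + [-p for p in phi]   # AR polynomial in B
    for _ in range(d):
        poly = poly_mul(poly, [1.0, -1.0])   # apply (1 - B)^d
    return poly
```

For instance, `arima_lhs_coeffs([0.5], 1)` returns `[1.0, -1.5, 0.5]`, i.e. an ARIMA(1,1,0) with \(ϕ_1 = 0.5\) expands to \(y_t = 1.5y_{t-1} - 0.5y_{t-2} + c + e_t\).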

Example :-

library(fpp2)
## Loading required package: ggplot2
## Loading required package: forecast
## Loading required package: fma
## Loading required package: expsmooth
autoplot(uschange[,"Consumption"]) +
  xlab("Year") + ylab("Quarterly percentage change")

fit = auto.arima(uschange[,"Consumption"], seasonal = FALSE)
summary(fit)
## Series: uschange[, "Consumption"] 
## ARIMA(2,0,2) with non-zero mean 
## 
## Coefficients:
##          ar1      ar2      ma1     ma2    mean
##       1.3908  -0.5813  -1.1800  0.5584  0.7463
## s.e.  0.2553   0.2078   0.2381  0.1403  0.0845
## 
## sigma^2 estimated as 0.3511:  log likelihood=-165.14
## AIC=342.28   AICc=342.75   BIC=361.67
## 
## Training set error measures:
##                        ME     RMSE      MAE      MPE    MAPE     MASE
## Training set 0.0007456313 0.584581 0.432925 49.21967 170.739 0.678305
##                     ACF1
## Training set 0.009876658

This is an ARIMA(2,0,2) model: \[y_t = c + 1.3908 y_{t-1} - 0.5813 y_{t-2} + e_t - 1.18 e_{t-1} + 0.5584 e_{t-2}\] where c = 0.7463 * (1 - 1.3908 + 0.5813) = 0.1421701, and \(e_t\) is white noise with standard deviation \(\sqrt{0.3511} = 0.5925\).
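The arithmetic for the intercept and error standard deviation can be checked directly (a Python sketch with the values copied from the summary above):

```python
import math

# Values from the auto.arima summary above.
mu, ar1, ar2 = 0.7463, 1.3908, -0.5813
c = mu * (1 - ar1 - ar2)       # intercept: c = mu * (1 - phi1 - phi2)
sigma_e = math.sqrt(0.3511)    # error s.d. from the reported sigma^2
```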

The forecasts are:

fit %>% forecast(h = 10) %>% autoplot(include = 80)

The constant c has an important effect on long term forecasts.

  1. If c = 0 and d = 0, the long-term forecasts will go to zero.

  2. If c = 0 and d = 1, the long-term forecasts will go to a constant.

  3. If c = 0 and d = 2, the long-term forecasts will follow a straight line.

  4. If c \(\ne\) 0 and d = 0, the long-term forecasts will go to mean of the data.

  5. If c \(\ne\) 0 and d = 1, the long-term forecasts will follow a straight line.

  6. If c \(\ne\) 0 and d = 2, the long-term forecasts will follow a quadratic trend.

The value of d also has an effect on the prediction intervals — the higher the value of d, the more rapidly the prediction intervals increase in size. For d = 0, the long-term forecast standard deviation will go to the standard deviation of the historical data, so the prediction intervals will all be essentially the same.

The value of p is important if the data show cycles. To obtain cyclic forecasts, it is necessary to have p \(\ge\) 2, along with some additional conditions on the parameters.

ACF and PACF plots

One way to choose the parameters p and q is via the ACF plot, which gives the autocorrelations between the series and its lagged values. One issue with the ACF is that if \(y_t\) and \(y_{t−1}\) are correlated, then \(y_{t−1}\) and \(y_{t−2}\) will also be correlated. Then \(y_t\) and \(y_{t−2}\) may appear correlated simply because both are connected to \(y_{t−1}\), rather than because \(y_{t-2}\) contains any new information that could be used in forecasting \(y_t\).

To overcome this we use partial autocorrelations. These measure the relationship between \(y_t\) and \(y_{t-k}\) after removing the effects of lags 1, 2, 3, …, k−1. Each partial autocorrelation can be estimated as the last coefficient in an autoregressive model.

Specifically, \(α_k\), the kth partial autocorrelation coefficient, is equal to the estimate of \(ϕ_k\) in an AR(k) model.
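This relationship can be checked numerically. Given the theoretical autocorrelations of an AR(1) process (\(ρ_k = ϕ^k\)), the Durbin–Levinson recursion recovers the PACF, which should equal \(ϕ\) at lag 1 and be zero at every later lag. A Python sketch (our own implementation):

```python
def pacf_from_acf(rho):
    """Durbin-Levinson recursion: partial autocorrelations from the
    autocorrelations rho[0..K], where rho[0] must be 1."""
    pacf = []
    phi = []                      # AR coefficients of the order-k model
    for k in range(1, len(rho)):
        if k == 1:
            a = rho[1]
        else:
            num = rho[k] - sum(phi[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1 - sum(phi[j] * rho[j + 1] for j in range(k - 1))
            a = num / den
        # Update the AR(k) coefficients; the last one is the kth PACF value.
        phi = [phi[j] - a * phi[k - 2 - j] for j in range(k - 1)] + [a]
        pacf.append(a)
    return pacf

# Theoretical ACF of an AR(1) with phi = 0.6: rho_k = 0.6**k.
rho = [0.6 ** k for k in range(5)]
partials = pacf_from_acf(rho)     # ~[0.6, 0, 0, 0]
```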

If the data are from ARIMA(0,d,q) or ARIMA(p,d,0) model, then the ACF and PACF plots can be helpful in determining the value of p or q. But if both p and q are positive, then the plots do not help in finding suitable values of p and q.

The data may follow an ARIMA(p,d,0) model if

  1. ACF is exponentially decaying or sinusoidal.

  2. there is a significant spike at lag p in the PACF, but none beyond lag p.

The data may follow an ARIMA(0,d,q) model if

  1. PACF is exponentially decaying or sinusoidal.

  2. there is a significant spike at lag q in the ACF, but none beyond lag q.

ggAcf(uschange[,"Consumption"], main = "")

ggPacf(uschange[,"Consumption"], main = "")

The ACF plot shows a decaying (sinusoidal) pattern, while the PACF has significant spikes up to lag 3 and none beyond, which suggests that an ARIMA(3,0,0) model might work for this data.

fit2 = Arima(uschange[,"Consumption"], order = c(3,0,0))
summary(fit2)
## Series: uschange[, "Consumption"] 
## ARIMA(3,0,0) with non-zero mean 
## 
## Coefficients:
##          ar1     ar2     ar3    mean
##       0.2274  0.1604  0.2027  0.7449
## s.e.  0.0713  0.0723  0.0712  0.1029
## 
## sigma^2 estimated as 0.3494:  log likelihood=-165.17
## AIC=340.34   AICc=340.67   BIC=356.5
## 
## Training set error measures:
##                       ME      RMSE       MAE      MPE     MAPE      MASE
## Training set 0.001106571 0.5847286 0.4294193 47.28068 171.0212 0.6728123
##                    ACF1
## Training set 0.01499451

This model gives a slightly better result than the model selected by the auto.arima function. This is because, by default, auto.arima does not consider every possible model in its search.

fit3 = auto.arima(uschange[,"Consumption"], seasonal = FALSE, stepwise = FALSE, approximation = FALSE)
summary(fit3)
## Series: uschange[, "Consumption"] 
## ARIMA(3,0,0) with non-zero mean 
## 
## Coefficients:
##          ar1     ar2     ar3    mean
##       0.2274  0.1604  0.2027  0.7449
## s.e.  0.0713  0.0723  0.0712  0.1029
## 
## sigma^2 estimated as 0.3494:  log likelihood=-165.17
## AIC=340.34   AICc=340.67   BIC=356.5
## 
## Training set error measures:
##                       ME      RMSE       MAE      MPE     MAPE      MASE
## Training set 0.001106571 0.5847286 0.4294193 47.28068 171.0212 0.6728123
##                    ACF1
## Training set 0.01499451

This time auto.arima finds the exact same ARIMA(3,0,0) model.

Estimating the model

Maximum likelihood estimation Once the model is decided (values of p, d and q), we estimate the parameters \(c, ϕ_1, ϕ_2,....ϕ_p, \theta_1,..... \theta_q\) by maximizing the likelihood of the data. This technique is called maximum likelihood estimation (MLE). This technique finds the values of the parameters which maximise the probability of obtaining the data that we have observed.

Information Criteria AIC is useful for selecting predictors in an ARIMA model. \[AIC = -2 \log(L) + 2(p+q+k+1)\] where L is the likelihood of the data, and k = 1 if \(c \ne 0\) and k = 0 otherwise.

Corrected AIC can be written as \[AICc = AIC + \frac{2(p+k+q+1)(p+k+q+2)}{T-p-q-k-2}\] and Bayesian Information criteria as

\[BIC = AIC + [log(T)-2](p+k+q+1)\]
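As a sanity check, these formulas reproduce the values reported for the ARIMA(3,0,0) fit above (log-likelihood −165.17; p = 3, q = 0, k = 1 for the constant; and assuming T = 187, the number of quarterly observations in uschange):

```python
import math

logL, p, q, k, T = -165.17, 3, 0, 1, 187
m = p + q + k + 1                         # parameters counted by the formulas
aic = -2 * logL + 2 * m                   # 340.34
aicc = aic + (2 * m * (m + 1)) / (T - p - q - k - 2)   # 340.67
bic = aic + (math.log(T) - 2) * m         # 356.5
```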

By default, auto.arima works as follows:

  1. Use the KPSS test to determine the number of differences d (\(0 \le d \le 2\)) required for the data.

  2. After differencing the data, choose p and q by minimising the AICc. To shorten this process, the function uses a stepwise search. Four initial models are fitted: (i) ARIMA(0,d,0), (ii) ARIMA(2,d,2), (iii) ARIMA(1,d,0), (iv) ARIMA(0,d,1). A constant is included unless d = 2; if d ≤ 1, an additional model, ARIMA(0,d,0) without a constant, is also fitted.

  3. The model with the smallest AICc among these becomes the current best model.

  4. Variations on the current model are considered: p and q are varied by \(\pm\) 1, and the constant is included or excluded.

  5. Whenever a variation achieves a lower AICc, it becomes the new best model, and the search repeats until no improvement is found.
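The stepwise search is essentially a hill-climb over (p, q). The sketch below illustrates the idea in Python with a hypothetical score function standing in for AICc (in practice each candidate model is fitted and its AICc computed, and the constant is varied as well):

```python
def stepwise_search(score, p_max=5, q_max=5):
    """Hill-climb over (p, q): start from a few candidates, then repeatedly
    move to the best neighbour (p or q varied by +/-1) until no improvement."""
    candidates = [(0, 0), (2, 2), (1, 0), (0, 1)]   # the four initial models
    best = min(candidates, key=score)
    while True:
        p, q = best
        neighbours = [(p + dp, q + dq)
                      for dp, dq in [(-1, 0), (1, 0), (0, -1), (0, 1)]
                      if 0 <= p + dp <= p_max and 0 <= q + dq <= q_max]
        challenger = min(neighbours, key=score)
        if score(challenger) < score(best):
            best = challenger
        else:
            return best

# Hypothetical score with its minimum at (3, 0), standing in for AICc.
toy_score = lambda pq: (pq[0] - 3) ** 2 + pq[1] ** 2
```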

Modeling Procedure

See the chart “arimaflowchart” in the repository.

Example: Electrical equipment orders. We first need to adjust for seasonality before applying ARIMA.

elecequip %>% stl(s.window = "periodic") %>% seasadj() -> eeadj
autoplot(eeadj)

The data are now seasonally adjusted and what remains is cyclic variation. The data do not need a transformation, as the variance is similar over the whole time period.

The data are not stationary, as the series wanders up and down for long periods. We will see if differencing helps.

eeadj %>% diff() %>% ggtsdisplay(main = "")

The differenced series looks stationary, apart from a few instances of high peaks and troughs, so we will not consider any further differencing.

The PACF shows spikes up to lag 3 and no significant spikes after that, while the ACF spikes at lag 1 but shows no clear pattern beyond. We can try ARIMA(3,1,0) and ARIMA(3,1,1); other variations could also be tried.

fit = Arima(eeadj, order = c(3,1,0))
summary(fit)
## Series: eeadj 
## ARIMA(3,1,0) 
## 
## Coefficients:
##           ar1      ar2     ar3
##       -0.3418  -0.0426  0.3185
## s.e.   0.0681   0.0725  0.0682
## 
## sigma^2 estimated as 9.639:  log likelihood=-493.8
## AIC=995.6   AICc=995.81   BIC=1008.67
## 
## Training set error measures:
##                      ME     RMSE      MAE         MPE     MAPE     MASE
## Training set 0.03546433 3.072732 2.387717 -0.01504649 2.513219 0.292178
##                     ACF1
## Training set -0.03026898
fit_2 = Arima(eeadj, order = c(3,1,1))
summary(fit_2)
## Series: eeadj 
## ARIMA(3,1,1) 
## 
## Coefficients:
##          ar1     ar2     ar3      ma1
##       0.0044  0.0916  0.3698  -0.3921
## s.e.  0.2201  0.0984  0.0669   0.2426
## 
## sigma^2 estimated as 9.577:  log likelihood=-492.69
## AIC=995.38   AICc=995.7   BIC=1011.72
## 
## Training set error measures:
##                     ME     RMSE      MAE          MPE     MAPE    MASE
## Training set 0.0328818 3.054718 2.357169 -0.006470086 2.481603 0.28844
##                     ACF1
## Training set 0.008980716

ARIMA(3,1,1) gives a slightly better result.

checkresiduals(fit_2)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(3,1,1)
## Q* = 24.034, df = 20, p-value = 0.2409
## 
## Model df: 4.   Total lags used: 24

The residuals from ARIMA(3,1,1) show no significant autocorrelation, and the residual plot looks like white noise. The Ljung-Box test suggests the same.

autoplot(forecast(fit_2))

Plotting the characteristic roots We can write the ARIMA model equation as \[(1 - ϕ_1B - ϕ_2B^2 - \dots - ϕ_pB^p) (1-B)^dy_t = c + (1+\theta_1B + \theta_2B^2 + \dots + \theta_qB^q)e_t\] where \(ϕ(B) = 1 - ϕ_1B - ϕ_2B^2 - \dots - ϕ_pB^p\) is the pth order polynomial in B, and

\(\theta(B) = 1 + \theta_1B + \theta_2B^2 + \dots + \theta_qB^q\) is the qth order polynomial in B.

The stationarity condition for the model requires the complex roots of \(ϕ(B)\) to lie outside the unit circle, and hence the inverse roots to lie inside the unit circle.

The invertibility condition for the model requires the complex roots of \(\theta(B)\) to lie outside the unit circle, and hence the inverse roots to lie inside the unit circle.

We can check that as

autoplot(fit_2)

The inverse roots are inside the unit circle. If any roots are near the unit circle or outside it, the forecasts will not be stable.
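The same check can be done numerically from the reported coefficients: substituting \(B = 1/w\) shows that the inverse AR roots are the roots of \(w^3 - ϕ_1w^2 - ϕ_2w - ϕ_3\), and the inverse MA root is \(-\theta_1\). A Python sketch using a basic Durand–Kerner root finder:

```python
def durand_kerner(coeffs, iters=200):
    """Find all complex roots of a monic polynomial.
    coeffs: [1, c_{n-1}, ..., c_0], highest power first."""
    n = len(coeffs) - 1
    roots = [(0.4 + 0.9j) ** k for k in range(n)]   # standard starting points
    for _ in range(iters):
        for i in range(n):
            num = 0j
            for c in coeffs:                        # Horner evaluation at roots[i]
                num = num * roots[i] + c
            den = 1 + 0j
            for j in range(n):
                if j != i:
                    den *= roots[i] - roots[j]
            roots[i] -= num / den
    return roots

# Fitted ARIMA(3,1,1) coefficients from the summary above.
ar1, ar2, ar3, ma1 = 0.0044, 0.0916, 0.3698, -0.3921
inv_ar_roots = durand_kerner([1, -ar1, -ar2, -ar3])
inv_ma_root = -ma1
```

All inverse roots have modulus below 1, matching the plot.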

Forecasts from ARIMA

Now we will see how ARIMA produces forecasts, using the ARIMA(3,1,1) model built above. The equation for this model can be written as \[(1 - ϕ_1B - ϕ_2B^2 - ϕ_3B^3) (1-B)y_t = (1+\theta_1B )e_t\] with the parameter values given in the output above. Expanding the left-hand side gives \[[1 - (1 + ϕ_1)B + (ϕ_1 - ϕ_2)B^2 + (ϕ_2 - ϕ_3)B^3 + ϕ_3B^4]y_t = (1 + \theta_1B)e_t\] and applying the backshift operator \[y_t - (1 + ϕ_1)y_{t-1} + (ϕ_1 - ϕ_2)y_{t-2} + (ϕ_2 - ϕ_3)y_{t-3} + ϕ_3y_{t-4} = e_t + \theta_1e_{t-1}\] \[y_t = (1 + ϕ_1)y_{t-1} - (ϕ_1 - ϕ_2)y_{t-2} - (ϕ_2 - ϕ_3)y_{t-3} - ϕ_3y_{t-4} + e_t + \theta_1e_{t-1}\] Now replace t with T+1: \(e_{T+1}\) is replaced by zero (its expected value) and \(e_T\) by the last observed residual. \[y_{T+1|T} = (1 + ϕ_1)y_{T} - (ϕ_1 - ϕ_2)y_{T-1} - (ϕ_2 - ϕ_3)y_{T-2} - ϕ_3y_{T-3} + \theta_1e_T\] Forecasts for T+2, and so on, can be obtained in a similar way.
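The same recursion works for any ARIMA(p,d,q): expand the AR and differencing polynomials into a single polynomial in B, then iterate forward with future errors set to zero and future observations replaced by their forecasts. A Python sketch (constant omitted for simplicity; function names are our own):

```python
def poly_mul(a, b):
    """Multiply two polynomials in B, coefficients ordered by power of B."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def arima_forecast(y, e, phi, theta, d, h):
    """h-step point forecasts by iterating the expanded ARIMA equation:
    future errors are set to zero, future observations to their forecasts."""
    a = [1.0] + [-p for p in phi]          # AR polynomial in B
    for _ in range(d):
        a = poly_mul(a, [1.0, -1.0])       # apply (1 - B)^d
    y, e = list(y), list(e)
    for _ in range(h):
        t = len(y)
        fc = -sum(a[i] * y[t - i] for i in range(1, len(a)))
        fc += sum(theta[j] * e[t - 1 - j] for j in range(len(theta)))
        y.append(fc)
        e.append(0.0)                      # future errors: expected value zero
    return y[-h:]
```

With no AR or MA terms, d = 1 reproduces naive forecasts (the last value repeated), and d = 2 extrapolates a straight line, matching the list of special cases above.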

Exercises

1. Generating models. We generate data from an AR(1) model with \(ϕ_1 = 0.6\) and error variance 1. The process starts with \(y_1\) = 0.

y = ts(numeric(100))
e = rnorm(100)
for(i in 2:100)
  y[i] = 0.6*y[i-1] + e[i]
autoplot(y)

We will see how the plot changes with ϕ.

ar1generator = function(phi){
  y = ts(numeric(100))
  e = rnorm(100)
  for(i in 2:100)
    y[i] = phi*y[i-1] + e[i]
  return(y)
}

autoplot(ar1generator(0.3), series = "0.3") +
  autolayer(ar1generator(0.6), series = "0.6") +
  autolayer(ar1generator(0.9), series = "0.9") +
  ylab("AR(1) Models") +
  guides(colour = guide_legend(title = "Phi1"))

As the value of \(ϕ_1\) increases, the variation in the series increases: the larger \(ϕ_1\) is, the more strongly each value depends on the previous one, so shocks persist for longer.

Now we will generate data from MA(1) process.

ma1_generator = function(theta){
  y = ts(numeric(100))
  e = rnorm(100)
  y[1] = e[1] # no earlier error is available for the first observation
  for(i in 2:100)
    y[i] = theta * e[i-1] + e[i]
  return(y)
}
autoplot(ma1_generator(0.3), series = "0.3") +
  autolayer(ma1_generator(0.6), series = "0.6") +
  autolayer(ma1_generator(0.9), series = "0.9") +
  ylab("MA(1) Models") +
  guides(colour = guide_legend(title = "theta1"))

The same conclusion holds as for the AR(1) models. Now we will generate data from an ARIMA(1,0,1) process with phi = 0.6 and theta = 0.6.

arima_101 = ts(numeric(50))
e = rnorm(50)
for(i in 2:50)
  arima_101[i] = 0.6*arima_101[i-1] + 0.6*e[i-1] + e[i]
autoplot(arima_101)

Now we will generate data from an AR(2) model with \(ϕ_1\) = -0.8 and \(ϕ_2\) = 0.3.

ar2_generator = ts(numeric(100))
e = rnorm(100)
for(i in 3:100)
  ar2_generator[i] = -0.8 * ar2_generator[i-1] + 0.3 * ar2_generator[i-2] + e[i]
autoplot(ar2_generator)

autoplot(arima_101, series = "ARMA(1, 1)") +
  autolayer(ar2_generator, series = "AR(2)") +
  ylab("y") +
  guides(colour = guide_legend(title = "Models"))

autoplot(arima_101)

The data from the AR(2) model oscillate with increasing amplitude: the parameters violate the stationarity conditions (\(ϕ_2 - ϕ_1 = 1.1 > 1\)), so the series is non-stationary. The data from the ARMA(1,1) model are stationary.

7. Fitting ARIMA Model

We will explore the wmurders data (number of women murdered each year, per 100,000 of the population).

autoplot(wmurders) +
  ylab("No. of murders") + ggtitle("No. of women murders data")

The data show no seasonality and no consistent trend, but the series is not stationary either. We can take a difference and check whether that makes it stationary.

# wmurders %>% BoxCox(lambda = BoxCox.lambda(wmurders)) %>% diff() %>% ggtsdisplay()
wmurders %>% diff() %>% ggtsdisplay()

The differenced data still do not look stationary and show some increasing variance with time. We can use the ndiffs function to find the appropriate order of differencing.

ndiffs(wmurders)
## [1] 2

This shows that the data need two differences.

wmurders %>% diff(differences = 2) %>% ggtsdisplay(main = "")

library(urca)
wmurders %>% diff(differences = 2) %>% ur.kpss() %>% summary()
## 
## ####################### 
## # KPSS Unit Root Test # 
## ####################### 
## 
## Test is of type: mu with 3 lags. 
## 
## Value of test-statistic is: 0.0458 
## 
## Critical value for a significance level of: 
##                 10pct  5pct 2.5pct  1pct
## critical values 0.347 0.463  0.574 0.739

Second-order differencing makes the data stationary (the KPSS test statistic is well below the critical values). The PACF plot shows a significant spike at lag 1, and the ACF plot shows spikes at lags 1 and 2. We can start with ARIMA(1,2,0) or ARIMA(0,2,2), or try both one by one and see which gives more reasonable ACF and PACF plots for the residuals.

Now, since we are using second-order differencing, we should avoid including a constant in the model, as it would induce a quadratic trend, which can create problems when forecasting.

We can write ARIMA(1,2,0) as \[(1 - ϕ_1B) (1-B)^2y_t = e_t\]

wmurders_arima_120 = Arima(wmurders, order = c(1,2,0))
summary(wmurders_arima_120)
## Series: wmurders 
## ARIMA(1,2,0) 
## 
## Coefficients:
##           ar1
##       -0.6719
## s.e.   0.0981
## 
## sigma^2 estimated as 0.05471:  log likelihood=2
## AIC=0   AICc=0.24   BIC=3.94
## 
## Training set error measures:
##                        ME      RMSE       MAE         MPE     MAPE
## Training set -0.001376898 0.2274352 0.1777919 0.001486708 5.080777
##                  MASE       ACF1
## Training set 1.093337 -0.1593845

We can also try ARIMA(0,2,2) model.

wmurders_arima_022 = Arima(wmurders, order = c(0,2,2))
summary(wmurders_arima_022)
## Series: wmurders 
## ARIMA(0,2,2) 
## 
## Coefficients:
##           ma1     ma2
##       -1.0181  0.1470
## s.e.   0.1220  0.1156
## 
## sigma^2 estimated as 0.04702:  log likelihood=6.03
## AIC=-6.06   AICc=-5.57   BIC=-0.15
## 
## Training set error measures:
##                      ME      RMSE       MAE        MPE     MAPE      MASE
## Training set -0.0113461 0.2088162 0.1525773 -0.2403396 4.331729 0.9382785
##                     ACF1
## Training set -0.05094066

ARIMA(0,2,2) gives a smaller AICc. Now we will look at the residuals.

checkresiduals(wmurders_arima_022)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,2,2)
## Q* = 11.764, df = 8, p-value = 0.1621
## 
## Model df: 2.   Total lags used: 10
Pacf(residuals(wmurders_arima_022))

The residuals have mean zero, but the variance does not look constant. The ACF plot looks good, and the PACF plot does not show any significant spikes either.

Let’s see if auto.arima arrives at the same model.

auto.arima(wmurders, seasonal = FALSE, stepwise = FALSE, approximation = FALSE)
## Series: wmurders 
## ARIMA(0,2,3) 
## 
## Coefficients:
##           ma1     ma2      ma3
##       -1.0154  0.4324  -0.3217
## s.e.   0.1282  0.2278   0.1737
## 
## sigma^2 estimated as 0.04475:  log likelihood=7.77
## AIC=-7.54   AICc=-6.7   BIC=0.35

ARIMA(0,2,3) gives better AICc and BIC values. This shows that more than one model can satisfy the conditions we target for the residuals.

Now we will calculate the forecasts manually. For ARIMA(0,2,2), the expanded equation is \[y_t = 2y_{t-1} - y_{t-2} + e_t + \theta_1 e_{t-1} + \theta_2 e_{t-2}\]

years = length(wmurders)
e = residuals(wmurders_arima_022)
fc1 = 2*wmurders[years] - wmurders[years-1] - 1.0181 * e[years] + 0.1470 * e[years-1]
fc2 = 2*fc1 - wmurders[years] + 0.1470 * e[years]
fc3 = 2*fc2 - fc1
c(fc1, fc2, fc3)
## [1] 2.480523 2.374887 2.269252
forecast(wmurders_arima_022, h = 3)
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2005       2.480525 2.202620 2.758430 2.055506 2.905544
## 2006       2.374890 1.985422 2.764359 1.779250 2.970531
## 2007       2.269256 1.772305 2.766206 1.509235 3.029276

The manually calculated forecasts are very similar to those from the forecast function.

forecast(wmurders_arima_022, h = 3) %>% autoplot()

8. Fitting an ARIMA Model

We will compare the effect of adding a drift term to the model, using austa: total international visitors to Australia (in millions).

autoplot(austa) + 
  xlab("Year") + ylab("No. of visitors") +
  ggtitle("Total international visitors to Australia")

We will use auto.arima function to find the appropriate model.

austa.arima = auto.arima(austa)
summary(austa.arima)
## Series: austa 
## ARIMA(0,1,1) with drift 
## 
## Coefficients:
##          ma1   drift
##       0.3006  0.1735
## s.e.  0.1647  0.0390
## 
## sigma^2 estimated as 0.03376:  log likelihood=10.62
## AIC=-15.24   AICc=-14.46   BIC=-10.57
## 
## Training set error measures:
##                        ME      RMSE       MAE       MPE     MAPE      MASE
## Training set 0.0008313383 0.1759116 0.1520309 -1.069983 5.513269 0.7461559
##                      ACF1
## Training set -0.000571993
autoplot(forecast(austa.arima, h = 10))

auto.arima selects an ARIMA(0,1,1) model with drift. Let's see if the residuals look like white noise.

checkresiduals(austa.arima)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,1) with drift
## Q* = 2.297, df = 5.2, p-value = 0.8266
## 
## Model df: 2.   Total lags used: 7.2

The residuals look like white noise. We can fit a simpler model with no drift and compare the results with the model above.

austa.arima_011 = Arima(austa, order= c(0,1,1))
autoplot(forecast(austa.arima_011 , h = 10))

We can see that the drift model makes sense here, as there is a strong trend component in the series.

Let’s remove the MA part as well and compare

austa.arima_010 = Arima(austa, order= c(0,1,0))
autoplot(forecast(austa.arima_010 , h = 10))

The forecasts from the last model are equivalent to naive forecasts. The forecasts from the model with the MA part are a little higher, as that model also adjusts for the last forecast error. The prediction intervals from the second model are wider, probably because it uses one more parameter.

austa.arima_213 = Arima(austa, order= c(2,1,3), include.drift = TRUE)
autoplot(forecast(austa.arima_213 , h = 10))

The model above adds AR and MA terms to the drift model. Since d = 1 and the model includes drift, the long-term forecasts still follow a straight line (see the list of cases above); the extra AR and MA terms mainly affect the short-term dynamics, so the forecasts increase but the rate of increase settles down over time.

Forecasts from an ARIMA(0,0,1) model with a constant.

austa.arima_001 = Arima(austa, order= c(0,0,1), include.constant = TRUE)
autoplot(forecast(austa.arima_001 , h = 10))

The forecasts quickly revert to the mean of the historical data. This happens because, for an MA(1) model, the error terms beyond one step ahead are set to zero, so all longer-horizon forecasts equal the mean.

austa.arima_000 = Arima(austa, order= c(0,0,0), include.constant = TRUE)
autoplot(forecast(austa.arima_000 , h = 10))

If we remove the MA part as well, the forecasts are simply the mean of the time series.

Forecasts from an ARIMA(0,2,1) model with no constant

austa.arima_021 = Arima(austa, order= c(0,2,1), include.constant = TRUE)
autoplot(forecast(austa.arima_021 , h = 10))

The rate of increase in the forecasts is very high, and the prediction intervals widen rapidly for forecasts further into the future.